Clustering and Active Learning Using a LSI Subspace
Authors
Abstract
CHAPTER 1: Introduction
1.1 Latent Semantic Indexing
1.2 Visual Exploration of the LSI Subspaces
1.3 Research Questions
1.4 Organization of the Thesis
CHAPTER 2: Literature Survey
2.1 Data Models for Semantic Content Representation
2.2 Text Clustering
2.3 Active Learning
2.4 Query Expansion
2.5 Social Network Analysis
CHAPTER 3: The LSI Subspace Signature Model
3.1 LSI Subspace Term Signatures and Document Signatures
3.2 LSI Subspace Signature Ranking
3.2.
Similar resources
Document clustering using the LSI subspace signature model
We describe the Latent Semantic Indexing Subspace Signature Model (LSISSM) for semantic content representation of unstructured text. Grounded in Singular Value Decomposition (SVD), the model represents terms and documents by the distribution signatures of their statistical contribution across the top-ranking latent concept dimensions. LSISSM matches term signatures with document signatures accor...
High-Dimensional Unsupervised Active Learning Method
In this work, a hierarchical ensemble projected clustering algorithm for high-dimensional data is proposed. The algorithm builds on the active learning method (ALM), a fuzzy learning scheme inspired by some behavioral features of human brain functionality. High-dimensional unsupervised active learning method (HUALM) is a clustering algorithm which blurs the da...
An Improved Semi-supervised Clustering Algorithm Based on Active Learning
To address two difficulties of traditional semi-supervised clustering algorithms, cluster deviation and high-dimensional data processing, a semi-supervised clustering algorithm based on active learning is proposed. Using active learning strategies in the algorithm can obtain a large amount of in...
Learning Robust Subspace Clustering
We propose a low-rank transformation-learning framework to robustify subspace clustering. Many high-dimensional data sets, such as face images and motion sequences, lie in a union of low-dimensional subspaces. The subspace clustering problem has been extensively studied in the literature to partition such high-dimensional data into clusters corresponding to their underlying low-dimensional subspaces....
Learning Transformations for Clustering and Classification
A low-rank transformation learning framework for subspace clustering and classification is proposed here. Many high-dimensional data sets, such as face images and motion sequences, approximately lie in a union of low-dimensional subspaces. The corresponding subspace clustering problem has been extensively studied in the literature to partition such high-dimensional data into clusters corresponding to...